# Multi-scenario Adaptation

Devstral Small 2505 GGUF

Quantized version of Devstral-Small-2505, offering multiple precision options to adapt to different hardware requirements

Large Language Model Supports Multiple Languages

Ultravox V0 5 Llama 3 2 1b GGUF

Ultravox v0.5 is an audio-to-text model built on the Llama-3.2-1B architecture, focusing on efficient speech transcription tasks.

Speech Recognition

AM Thinking V1 GGUF

AM-Thinking-v1 is a text generation model distributed in the GGUF format, suitable for various natural language processing tasks.

Large Language Model

Andrewzh Absolute Zero Reasoner Coder 7b GGUF

llama.cpp quantization of andrewzh's Absolute_Zero_Reasoner-Coder-7b model, available at multiple quantization levels and suited to reasoning and code generation tasks.

Large Language Model

Nousresearch.deephermes ToolCalling Specialist Atropos GGUF

DeepHermes-ToolCalling-Specialist-Atropos is a text generation model focused on tool calling, designed to achieve efficient task execution through natural language processing technology.

Large Language Model

Allura Org Remnant Glm4 32b GGUF

Remnant-GLM4-32B is a 32B-parameter large language model based on the GLM4 architecture, supporting role-playing and conversational interactions, particularly suitable for salamander-related applications.

Large Language Model

Multi2convai Quality De Bert

This is a Bert model optimized for German, focusing on text classification tasks in the quality domain.

Text Classification

Transformers German

Huihui Ai.glm 4 9B 0414 Abliterated GGUF

GLM-4-9B-0414-abliterated is a large language model with 9B parameters based on the GLM architecture, suitable for text generation tasks.

Large Language Model

TRELLIS Text Xlarge

TRELLIS Text XL is a large-scale text-conditioned 3D generation model that can generate corresponding 3D content based on the input text.

Text-to-Image English

TRELLIS Text Large

TRELLIS Text Large is a large-scale text-to-3D generation model that enables scalable and diverse 3D content generation based on a structured 3D latent space.

Text-to-Image English

Huihui Ai DeepSeek R1 Distill Llama 70B Abliterated GGUF

GGUF quantized version of DeepSeek-R1-Distill-Llama-70B-abliterated, suitable for local inference, offering multiple quantization options to meet different hardware requirements.

Large Language Model

Vitpose Plus Base

ViTPose is a vision Transformer-based human pose estimation model that achieves an outstanding performance of 81.1 AP on the MS COCO keypoint detection benchmark with a simple design.

Pose Estimation

Transformers English

A vision Transformer-based human pose estimation model achieving an outstanding performance of 81.1 AP on the MS COCO keypoint test set

Pose Estimation

Transformers English

Summllama3.1 8B GGUF

An 8B-parameter summarization model built on the Llama 3.1 architecture, offered in multiple quantized versions

Large Language Model

Sam2 Hiera Tiny.fb R896 2pt1

SAM2 model based on the HieraDet image encoder, focusing on image feature extraction tasks.

Object Detection

Sam2 Hiera Base Plus.fb R896 2pt1

SAM2 model weights based on HieraDet image encoder, focused on image feature extraction tasks

Image Segmentation

Moonshine Base ONNX

ONNX-format automatic speech recognition model based on the Moonshine base model, supporting efficient inference

Speech Recognition

Wavlm Large Finetuned SER

A speech emotion recognition model based on WavLM-Large, supporting English speech emotion classification.

Audio Classification English

Jenna Ortega Flux

A LoRA model customized based on the FLUX.1-dev foundation model, specializing in generating realistic-style portraits of Jenna Ortega.

Pathumma Whisper Th Large V3

Pathumma Whisper Large V3 is a Thai automatic speech recognition model based on the OpenAI Whisper architecture, supporting Thai and English speech transcription tasks.

Speech Recognition

Transformers Supports Multiple Languages

Allegro is an open-source high-quality text-to-video generation model capable of producing 6-second detailed videos at 720x1280 resolution and 15 FPS.

Text-to-Video English

Belle Whisper Large V3 Turbo Zh

A Chinese speech recognition model fine-tuned based on whisper-large-v3-turbo, showing significant performance improvements in multiple Chinese speech recognition benchmarks

Speech Recognition

PGTFormer is an image-to-image transformation model based on PyTorch, integrated and pushed to Hugging Face Hub via PytorchModelHubMixin.

Image Generation

Reverb Diarization V2

Reverb Speaker Diarization V2 is a speaker diarization model based on pyannote-audio, outperforming the baseline pyannote3.0 model on multiple test sets.

Audio Processing

This model is used to convert image content into textual descriptions and is suitable for non-commercial purposes.

Text Recognition

add-detail-xl is a detail adjustment model for SDXL. It can increase or decrease image details by adjusting the weight, bringing more flexibility to image generation.

Image Generation

Moralbert Predict Subversion In Lyrics

This is a PyTorch-based text classification model suitable for various text classification tasks.

Text Classification

Image Captioning Vit Gpt2 Flick8k

This model can convert input images into descriptive text, suitable for image understanding tasks in various scenarios.

Detr Face Detection

A face detection model released under the CreativeML-OpenRAIL-M license, tagged for English and used primarily for object detection tasks.

Object Detection

Transformers English

This is an image-to-text model released under the Apache-2.0 license, capable of converting image content into textual descriptions.

Text Recognition

Gemma 7b Finetuned

A prompt optimization model fine-tuned with the QLoRA method, designed to improve the clarity and effectiveness of text prompts.

Large Language Model

Parrots Chinese Hubert Base

The Chinese HuBERT base model is a pre-trained model for text-to-speech tasks, supporting Chinese speech processing.

Speech Synthesis

Transformers Chinese

Parrots Chinese Roberta Wwm Ext Large

Chinese pre-trained model based on RoBERTa architecture, supporting text-to-speech tasks

Large Language Model

Transformers Chinese

Imagecaptioningtransformers

This model can convert input images into descriptive text, suitable for various image content understanding tasks across multiple scenarios.

Image Generation

adityarajkishan

Openbuddy Deepseek 10b V17.1 4k GGUF

OpenBuddy is an open-source multilingual chatbot that supports communication in multiple languages.

Large Language Model Supports Multiple Languages

Belle Distilwhisper Large V2 Zh

A Chinese speech recognition model fine-tuned from distilwhisper-large-v2, running 5.8 times faster than whisper-large-v2 with 51% fewer parameters

Speech Recognition

Whisper Large V3 French Distil Dec8

This is a distilled version of the Whisper-Large-V3 French model, optimized for inference speed and memory usage by reducing the number of decoder layers while maintaining good performance.

Speech Recognition

Transformers French

Whisper Large V3 French

A French automatic speech recognition model fine-tuned from OpenAI Whisper-large-v3, supporting prediction of casing, punctuation, and numbers

Speech Recognition

Transformers French

Orionstar Yi 34B Chat Llama GGUF

OrionStar Yi 34B Chat Llama is a large language model based on the Yi 34B architecture, focusing on Chinese dialogue tasks.

Large Language Model Other

Noromaid 20b V0.1.1

Noromaid-20b-v0.1.1 is a large language model suitable for role-playing, emotional role-playing, and general scenarios, jointly developed by IkariDev and Undi.

Large Language Model

© 2025 AIbase